Method and apparatus for implementing high speed signals using differential reference signals

ABSTRACT

A device contains a first device and a second device. In one embodiment, the first device drives at least three signals, a first reference signal, and a second reference signal. The second device, which is coupled to the first device, receives the at least three signals, the first reference signal, and the second reference signal. The second device identifies values for the at least three signals according to the first reference signal and the second reference signal.

BACKGROUND OF THE INVENTION

[0001] 1. Field of the Invention

[0002] The present invention relates generally to the field of computer systems. More specifically, the present invention relates to high-speed signal processing.

[0003] 2. Description of the Related Art

[0004] With the rapid development of processor technologies, a faster bus implementation is needed to transfer data or control signals between processor components. Typically, a system contains multiple buses, including processor and system buses, and the buses are generally the slower components in the system. Thus, in order to optimize a high-performance processor, high-speed buses are typically required.

[0005] A first approach to improving bus performance is to employ a conventional differential signaling bus scheme. A problem with this approach is that the differential signaling bus requires two additional reference signals for each data signal. Thus, this approach increases the number of bus wires by at least a factor of two, and consequently consumes a large amount of power and chip space to operate the additional wires.

[0006] A second approach to improving bus speed is to use a conventional differential signaling bus scheme in which the reference signals are generated locally. A problem with this approach is that most of the signal margin needed to trigger the sense amplifier may be lost at the receiving end because the power supplies for the driver and the power supplies for the receiver are located far apart. Thus, the signal margins for this approach must be increased and, accordingly, more power is required to operate the bus.

SUMMARY OF THE INVENTION

[0007] A device contains a first device and a second device. In one embodiment, the first device drives at least three signals, a first reference signal, and a second reference signal. The second device, which is coupled to the first device, receives the at least three signals, the first reference signal, and the second reference signal. The second device identifies values for the at least three signals according to the first reference signal and the second reference signal.

[0008] Additional features and benefits of the present invention will become apparent from the detailed description, figures and claims set forth below.

BRIEF DESCRIPTION OF THE DRAWINGS

[0009] The present invention will be understood more fully from the detailed description given below and from the accompanying drawings of various embodiments of the invention, which, however, should not be taken to limit the invention to the specific embodiments, but are for explanation and understanding only.

[0010] FIG. 1 is a block diagram of one embodiment of the processing unit.

[0011] FIG. 2 is a bus block diagram illustrating one embodiment of a bus scheme.

[0012] FIG. 3 illustrates one embodiment of a sensing device.

[0013] FIG. 4 is a block diagram illustrating one embodiment of a bus configuration with pre-charge and equalizer circuits.

[0014] FIG. 5 is a timing diagram illustrating an embodiment of a process for implementing the pseudo differential bus scheme.

[0015] FIG. 6 is a flowchart illustrating an embodiment of a process for implementing the pseudo differential bus scheme.

DETAILED DESCRIPTION

[0016] A method and an apparatus for implementing high-speed signals using a pseudo differential bus mechanism are described.

[0017] Numerous specific details are set forth in the following description in order to provide a thorough understanding of the present invention. It will be apparent, however, to one skilled in the art that the present invention can be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form in order to avoid obscuring the present invention.

[0018] Some portions of the detailed descriptions that follow are presented in terms of algorithms and symbolic representations of operations on data bits within a computer memory. These algorithmic descriptions and representations are the means used by those skilled in the data processing arts to most effectively convey the substance of their work to others skilled in the art. An algorithm is here, and generally, conceived to be a self-consistent sequence of steps leading to a desired result. The steps are those requiring physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. It has proven convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like.

[0019] It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise in the following discussions, it is appreciated that throughout the present invention, discussions utilizing terms such as “processing” or “computing” or “calculating” or “determining” or “displaying” etc. refer to the action and processes of a computer system, or similar electronic computing device. That is, a device that manipulates and transforms data represented as physical (electronic) quantities within the computer system's registers and memories into other data similarly represented as physical quantities within the computer system memories or registers, or other such information storage, transmission or display devices.

[0020] The present invention also relates to an apparatus for performing the operations herein. This apparatus may be specially constructed for the required purposes, or it may comprise a general-purpose computer selectively activated or reconfigured by a computer program stored in the computer. Such a computer program may be stored in a computer readable storage medium, such as but not limited to, any type of disk including floppy disks, optical disks, CD-ROMs, and magnetic-optical disks, read-only memories (ROMs), random access memories (RAMs), EPROMs, EEPROMs, magnetic or optical cards, or any type of media suitable for storing electronic instructions, and each coupled to a computer system bus.

[0021] The algorithms and displays presented herein are not inherently related to any particular computer or other apparatus. Various general-purpose systems may be used with programs in accordance with the teachings herein, or it may prove convenient to construct a more specialized apparatus to perform the required method steps. The required structure for a variety of these systems will appear from the description below. In addition, the present invention is not described with reference to any particular programming language. It will be appreciated that a variety of programming languages may be used to implement the teachings of the invention as described herein.

[0022] Overview

[0023] A mechanism for high-speed signal implementation using differential reference signals is disclosed. In one embodiment, a driver sends multiple groups of signals to a receiver over a set of wires. Each group of signals contains a high reference signal, a low reference signal, and multiple signals, such as, for example, four data signals. When the receiver senses the multiple groups of signals, the receiver identifies logic values for each signal in response to the high and low reference signals. In one embodiment, the high and low reference signals are shared among four data signals.
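The wire budget implied by this grouping can be illustrated with a short sketch. It assumes only what the paragraph above states, namely that each group carries its data signals plus one high and one low reference signal; the function names and the eight-group example are hypothetical and are not part of the disclosure.

```python
# Illustrative wire count for a bus built from signal groups.
# Assumption: each group = N data wires + 2 shared reference wires.

REFERENCE_WIRES_PER_GROUP = 2  # one high reference, one low reference


def wires_per_group(data_signals: int) -> int:
    """Wires needed for one group that shares a single reference pair."""
    if data_signals < 1:
        raise ValueError("a group needs at least one data signal")
    return data_signals + REFERENCE_WIRES_PER_GROUP


def total_wires(groups: int, data_signals_per_group: int = 4) -> int:
    """Wires needed for a bus made of identical groups."""
    return groups * wires_per_group(data_signals_per_group)


if __name__ == "__main__":
    # Four data signals sharing one reference pair occupy six wires.
    print(wires_per_group(4))
    # A hypothetical 32-bit bus arranged as eight such groups.
    print(total_wires(8))
```

With four data signals per group, the reference overhead is half a wire per data signal, and it shrinks further as more signals share the same reference pair.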

[0024] In another embodiment, a group of wires carrying multiple signals, also known as a bus, is pre-charged and equalized during the pre-charge clock cycle. In this embodiment, the wires closer to the receiver end are pre-charged while the wires closer to the driver end are equalized. Using the pre-charge and equalization circuits reduces bus set-up time and, consequently, allows the bus to operate at a higher clock frequency.

[0025] FIG. 1 is a block diagram of one embodiment of the processing unit 100. Processing unit 100 includes a bus interface 102, a cache 104, a decoder 106, a register file 108, a floating-point execution unit 112, and an integer execution unit 110. Of course, processing unit 100 may contain additional circuitry, which is not necessary to understanding the invention.

[0026] Integer execution unit 110, which further includes an integer arithmetic logic unit (“ALU”) 122, is used for executing integer instructions received by processing unit 100. Integer execution unit 110 performs various data manipulations including storing, fetching, addressing, and integer calculations. Integer execution unit 110 is further coupled to floating-point execution unit 112. In one embodiment, integer execution unit 110 includes floating-point execution unit 112. Floating-point execution unit 112 includes a floating-point ALU 120 to perform floating-point arithmetic.

[0027] Integer execution unit 110 is coupled to a register file 108 via an internal bus 130. Register file 108 represents a storage area on processing unit 100 for storing information, including data. One embodiment of the register file 108 contains various special registers, such as machine specific registers, status registers, et cetera. Integer execution unit 110 is further coupled to a cache 104 and a decoder 106. Cache 104 is used to cache data and/or control signals. Decoder 106 is used for decoding instructions received by processing unit 100 into control signals and/or microcode entry points. In response to these control signals and/or microcode entry points, integer execution unit 110 performs the appropriate operations. Decoder 106 may be implemented using any number of different mechanisms (e.g., a look-up table, a hardware implementation, etc.).

[0028] Bus interface 102 is used to communicate between processing unit 100 and the rest of the components in the system, such as main memories, input/output devices, and system bus. Other components may be included in processing unit 100, such as a second level cache. Processing unit 100, in one embodiment, is integrated into a single integrated circuit (“IC”).

[0029] FIG. 2 illustrates one embodiment of a bus scheme 200, which includes a driver 209, a receiver 239, and a storage device 249. Driver 209 further includes multiple driver circuits 212, 214, 216 and 218. Receiver 239 also includes multiple receiver circuits 230, 232. Storage device 249 also contains latch circuit 240 and latch circuit 242. In one embodiment, driver 209, receiver 239, storage device 249, and multiple wires 222-228 may be integrated into a single integrated circuit. Other blocks may be included in block diagram 200, but they are not important to understanding the present invention.

[0030] In one embodiment, driver 209 receives multiple data signals and reference signals. The reference signals 202, 208 may be generated from power supplies. In another embodiment, the reference signals 202, 208 are generated by other components. When the high-reference signal 202 reaches driver circuit 212, driver circuit 212 drives high-reference signal 202 onto a bus 222. Similarly, when the low-reference signal 208 reaches driver circuit 218, the driver circuit 218 drives the low-reference signal 208 onto a bus 228. Driver circuit 214 receives data 204 and drives the data 204 onto a bus 224. Driver circuit 216 receives data 206 and drives the data 206 onto a bus 226. Note that more data may be received by driver 209, and more data may be driven onto the data buses.

[0031] In one embodiment, driver 209 drives four data signals and two reference signals. In this embodiment, driver circuits 212-218 are source follower drivers because the driver circuits 212-218 only need to drive signals to a few hundred millivolts to trigger the sense amplifiers. An advantage of using the source follower driver is efficient driving strength with low voltage swings. Thus, using the source follower driver reduces power consumption. Another advantage of using the source follower driver is that it can receive input signals from a dynamic circuit or from another sense amplifier, which will be discussed later.

[0032] In one embodiment, receiver 239 contains six receiver circuits, where two receiver circuits are used to receive the high and low reference signals and the other four receiver circuits are dedicated to receiving data signals. In another embodiment, receiver circuit 230 is a P-sense amplifier receiver and is configured to sense a logical value of the data signal using the reference signals from buses 222 and 228. Due to the use of the reference signals, the receiver circuit can identify the logic value of a signal with a few hundred millivolts instead of 1.5 volts, which is, in one embodiment, the full voltage level for representing a logic 1 value. For example, receiver circuit 230 senses a logic value 1 if the signal on bus 224 is 105 millivolts while the low reference signal is 5 millivolts. Also, receiver circuit 232 senses a logic value 0 if the signal on bus 226 is 10 millivolts while the high reference signal is 100 millivolts. Since the buses 222-228 carry a relatively small amount of charge or current, the wire pitch size for the bus can be reduced. Pitch size is the width of a wire plus the width of the insulator. Moreover, the common mode noise rejection within the bus is also enhanced due to the small pitch size and the high and low reference signals.
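The two numeric examples above can be reproduced with a small behavioral sketch. It assumes, purely for illustration, that the receiver resolves a signal to whichever reference level it lies closer to; the disclosure does not specify the decision rule at this level of detail, and the function name is hypothetical.

```python
# Behavioral sketch of reference-based sensing (not a circuit model).
# Assumption: the logic value is that of the nearer reference level.


def identify_logic_value(signal_mv: float, high_ref_mv: float, low_ref_mv: float) -> int:
    """Return 1 if the signal is nearer the high reference, otherwise 0."""
    if high_ref_mv <= low_ref_mv:
        raise ValueError("high reference must exceed low reference")
    distance_to_high = abs(signal_mv - high_ref_mv)
    distance_to_low = abs(signal_mv - low_ref_mv)
    return 1 if distance_to_high < distance_to_low else 0


if __name__ == "__main__":
    # Values taken from the example: references at 100 mV and 5 mV.
    print(identify_logic_value(105, high_ref_mv=100, low_ref_mv=5))  # signal on bus 224
    print(identify_logic_value(10, high_ref_mv=100, low_ref_mv=5))   # signal on bus 226
```

Because the decision depends only on the signal's position relative to the two references, a disturbance that shifts all three wires by the same amount leaves the result unchanged in this model.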

[0033] In one embodiment, storage device 249 contains multiple latch circuits 240-242. In another embodiment, latch circuits 240 and 242 can be static latches. Storage device 249 latches data from receiver 239 and stores the data for the next clock cycle.

[0034] In one operation, driver 209 receives reference signals 202, 208 and data signals 204, 206. After completion of receipt, driver 209 drives the data signals and reference signals onto the bus. For example, driver circuit 212 receives high reference signal 202 and subsequently drives high reference signal 202 onto the bus 222. In one embodiment, bus 222 is 9,000 microns in length without repeaters in between. A repeater is a circuit that re-powers signals. When receiver 239 receives the signals, the receiver circuits identify the logic value for each data signal according to the high and low reference signals. Note that block diagram 200 may contain more than four data signals.

[0035] FIG. 3 is a circuit diagram 300 illustrating one embodiment of a sensing device. In one embodiment, circuit diagram 300 is a P-sense amplifier receiver and it contains P-MOS (“Metal Oxide Semiconductor”) P1-P5, 302-310, respectively, and N-MOS N1-N4, 312-318, respectively. P5 310, N1 312, and N4 318 are used to perform pre-charge functions. N2 314 and N3 316 are dedicated to output functions. P1 302 and P2 304 are used to receive data or control signals while P3 306 and P4 308 are dedicated to receiving reference signals. In one embodiment, P1 302, P2 304, P3 306, and P4 308 are similarly sized transistors.

[0036] In one embodiment, while the source terminal of P5 310 is coupled to Vcc power supply, the drain terminal of P5 310 is coupled to node A. The gate terminal of P5 310 is coupled to the gate terminal of N1 312 and the gate terminal of N4 318. In one embodiment, the gate terminal of P5 310 is also connected to a pre-charge clock. While the source terminals of P1 302 and P2 304 are coupled to node A, the drain terminals of P1 302 and P2 304 are coupled to the complement output 336. The gate terminals of P1 302 and P2 304 are coupled to input signal 330.

[0037] Also, the source terminals of P3 306 and P4 308 are coupled to node A and the drain terminals of P3 306 and P4 308 are coupled to the output terminal 338. While the gate terminal of P3 306 is coupled to the high reference signal 332, the gate terminal of P4 308 is coupled to the low reference signal 334. While the gate terminal of N2 314 is coupled to the output 338, the gate terminal of N3 316 is coupled to the complement output 336. The source terminals of N1 312 and N2 314 are connected to the ground power supply 350 and the drain terminals of N1 312 and N2 314 are coupled to the complement output 336. The source terminals of N3 316 and N4 318 are connected to the ground power supply 350 and the drain terminals of N3 316 and N4 318 are coupled to the output 338.

[0038] In one operation, when the input signal 330 is a logic 1, P1 302, P2 304, and P3 306 are off. Since the low reference signal 334 is low, which turns on P4 308, the output 338 produces logic 1. Since output 338 is logic 1, N2 314 is on, which drives complement output 336 to zero. When complement output 336 is zero, it turns off N3 316. Thus, when the input signal 330 is a logic 1, the output 338 outputs a logic 1.

[0039] On the other hand, if the input signal 330 is a logic 0, P1 302, P2 304, and P4 308 are all on at the same time. When P1 302 and P2 304 are both on at the same time, N3 316 is turned on faster than N2 314. When, in one embodiment, N3 316 can drain more current than N2 314, the output 338 is driven to a logic 0. When the output 338 is at logic 0, N2 314 is off and subsequently the complement output 336 is at logic 1. Thus, when input signal 330 is a logic 0, the output 338 is also a logic 0. Note that an N-sense amplifier receiver can be derived from circuit diagram 300 by replacing P-MOS with N-MOS and replacing N-MOS with P-MOS.
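The input/output behavior described for circuit diagram 300 can be summarized in a logic-level sketch. This is not a transistor-level simulation. It assumes, based on the gate connections described above, that a high pre-charge clock turns off P5 310 and turns on N1 312 and N4 318, pulling both outputs to ground, and that a low pre-charge clock lets the amplifier evaluate. The function name is hypothetical.

```python
from typing import Tuple

# Logic-level summary of the P-sense amplifier receiver behavior.
# Assumption: pre-charge clock high -> both outputs pulled to ground;
# pre-charge clock low -> output 338 follows input signal 330.


def p_sense_amplifier(precharge_clock: int, input_bit: int) -> Tuple[int, int]:
    """Return (output 338, complement output 336) as logic values."""
    if precharge_clock not in (0, 1) or input_bit not in (0, 1):
        raise ValueError("logic values must be 0 or 1")
    if precharge_clock == 1:
        # N1 and N4 conduct while P5 is off: both outputs sit at ground.
        return (0, 0)
    # Evaluation: the output matches the input; the complement is inverted.
    return (input_bit, 1 - input_bit)


if __name__ == "__main__":
    for clock in (1, 0):
        for bit in (0, 1):
            print(clock, bit, p_sense_amplifier(clock, bit))
```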

[0040] FIG. 4 is a block diagram 400 illustrating one embodiment of a bus configuration with pre-charge and equalizer circuits. Block diagram 400 contains an equalizer circuit 109, a pre-charge circuit 119, and a receiver 422. Block diagram 400 also contains bus A 402 and bus B 406 where bus A 402 and bus B 406 are driven by a driver (not shown). In one embodiment, receiver 422 is a sense amplifier receiver. Other components may be added to block diagram 400, but they are not important to understanding the disclosed system.

[0041] In one embodiment, equalizer circuit 109 contains an N-type transistor such as an N-MOS transistor. However, if an additional bus is added in block diagram 400, at least one more transistor may be required in equalizer circuit 109. Moreover, the N-type transistor N1 410 of equalizer circuit 109 may be replaced with a P-type transistor if the polarity of the pre-charge signal 404 is changed. In another embodiment, pre-charge circuit 119 contains two N-type transistors N2 412 and N3 414. However, if an additional bus is added in block diagram 400, one or more N-type transistors may be required in pre-charge circuit 119 to perform the pre-charge function.

[0042] In one embodiment, equalizer circuit 109 is placed closer to the driver's side of the bus while pre-charge circuit 119 is placed closer to the receiver's side of the bus to conserve power. Referring back to FIG. 4, the buses are initially charged at the driver's side and the charge propagates from the driver's side of the bus to the receiver's side of the bus. Since the driver's side of the bus contains higher voltage levels than the bus at the receiver's side, using equalizer circuit 109 closer to the driver's side of the bus saves or recycles a large amount of power. Thus, in one embodiment the equalizer circuit 109 does not discharge the charge, but equalizes the charge between the bus wires.

[0043] In one operation, the bus at the receiver end is pre-charged and the bus at the driver's side is equalized before the driver starts driving the bus. Equalizer circuit 109 may be repeated if the bus is long. Since the charge on the buses is not directly discharged on the driver end, a large amount of power is saved and the speed of the bus is improved due to less load on the bus.
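The effect of equalizing rather than discharging can be illustrated with the standard charge-sharing relation for ideal capacitors connected together, where the final voltage is the total charge divided by the total capacitance. The sketch below models each wire as a lumped capacitor, which is a simplifying assumption; the function name and the example values are hypothetical and are not taken from the disclosure.

```python
from typing import Sequence

# Charge-sharing sketch for an equalizer that shorts bus wires together.
# Assumption: each wire is an ideal lumped capacitor with no leakage.


def equalized_voltage(voltages: Sequence[float], capacitances: Sequence[float]) -> float:
    """Common voltage after the wires are connected: sum(C*V) / sum(C)."""
    if len(voltages) != len(capacitances) or not voltages:
        raise ValueError("need one capacitance per wire and at least one wire")
    if any(c <= 0 for c in capacitances):
        raise ValueError("capacitances must be positive")
    total_charge = sum(v * c for v, c in zip(voltages, capacitances))
    return total_charge / sum(capacitances)


if __name__ == "__main__":
    # Two equal wires, one driven high and one driven low in the last cycle.
    print(equalized_voltage([1.0, 0.0], [1.0, 1.0]))
```

In this idealized model the total charge on the wires is conserved, which is consistent with the statement that the equalizer circuit does not discharge the bus but redistributes charge between the wires.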

[0044] FIG. 5 is a timing diagram 500 illustrating an embodiment of a process for implementing the pseudo differential bus scheme. Timing diagram 500 illustrates five clock cycles, where clock cycles 1, 3, and 5 are pre-charge clock cycles and cycles 2 and 4 are data enabled clock cycles. Timing diagram 500 further illustrates a data signal 502, a pre-charge and equalization signal 504, a sense enable clock 505, an input data signal 506, a complement input data signal 508, an output signal 510, and a complement output signal 512.

[0045] Input data 506 and the complement input data 508 are pre-charged at the falling edge of the pre-charge and equalization cycle 504. At the rising edge of the pre-charge and equalization cycle 504, in one embodiment input data 506 and the complement input data 508 are charged to voltage levels where a sensing amplifier can detect a logic value from the voltage levels. In one embodiment, sense enable clock 505 is a sense amp enable clock, which is used to indicate when the data is sampled.

[0046] The output signal 510 changes from logic 0 to logic 1 at the beginning of the clock cycle 3 and changes from logic 1 to logic 0 during the pre-charge and equalization cycle 504. In this embodiment, when input data 506 is high, output 510 is also high. If input data 506 is logic 0, output 510 is also logic 0. The complement output 512 is the inverse logic value of output 510. Other signal waveforms may be added in the timing diagram, but they are not important to understanding the disclosed diagram.
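The alternation of phases in timing diagram 500 can be expressed as a short sketch. It assumes only the pattern stated for the five illustrated cycles, with odd-numbered cycles used for pre-charge and even-numbered cycles used for data; extending the pattern beyond cycle 5 is an assumption, and the function name is hypothetical.

```python
# Phase labeling for the clock cycles of the timing diagram.
# Assumption: odd cycles are pre-charge cycles, even cycles are data enabled.


def cycle_phase(cycle: int) -> str:
    """Return the phase name for a 1-based clock cycle number."""
    if cycle < 1:
        raise ValueError("cycle numbers start at 1")
    return "pre-charge" if cycle % 2 == 1 else "data enabled"


if __name__ == "__main__":
    for cycle in range(1, 6):
        print(cycle, cycle_phase(cycle))
```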

[0047] FIG. 6 is a flowchart 600 illustrating an embodiment of a process for implementing the pseudo differential bus scheme. The process begins at start block and proceeds to block 602. At block 602, the process receives at least three signals, a high reference signal, and a low reference signal. In one embodiment, the signals are data signals. In an alternative embodiment, the signals are control signals. After block 602, the process proceeds to block 604, where the process identifies the value of the data in response to the high reference signal and the low reference signal. After block 604, the process proceeds to block 606. At block 606, the values of the data signals are amplified before they are latched by the storage device. After block 606, the process proceeds to block 608 where the process outputs amplified values of the signal and the outputs are stored in the storage device. After block 608, the process ends.
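The sequence of blocks 602 through 608 can be walked through in a behavioral sketch. The decision rule (nearest reference) is an illustrative assumption rather than the disclosed circuit behavior, the 1.5 volt full-swing level is the value given for one embodiment, and the function and constant names are hypothetical.

```python
from typing import List, Sequence

FULL_SWING_VOLTS = 1.5  # full logic-1 level in one embodiment of the description


def run_flow(signals_mv: Sequence[float], high_ref_mv: float, low_ref_mv: float) -> List[float]:
    """Walk blocks 602-608 and return the stored, amplified values in volts."""
    # Block 602: receive at least three signals plus the two references.
    if len(signals_mv) < 3:
        raise ValueError("the process receives at least three signals")
    if high_ref_mv <= low_ref_mv:
        raise ValueError("high reference must exceed low reference")

    # Block 604: identify each value from the references
    # (assumed rule: choose the nearer reference level).
    values = [
        1 if abs(s - high_ref_mv) < abs(s - low_ref_mv) else 0
        for s in signals_mv
    ]

    # Block 606: amplify the identified values to full-swing levels.
    amplified = [FULL_SWING_VOLTS * v for v in values]

    # Block 608: output the amplified values and store them (a list stands
    # in for the latches of the storage device).
    storage = list(amplified)
    return storage


if __name__ == "__main__":
    print(run_flow([105, 10, 95, 8], high_ref_mv=100, low_ref_mv=5))
```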

[0048] In the foregoing detailed description, the method and apparatus of the present invention have been described with reference to specific exemplary embodiments thereof. It will, however, be evident that various modifications and changes may be made thereto without departing from the broader spirit and scope of the present invention. The present specification and figures are accordingly to be regarded as illustrative rather than restrictive.

[0049] Thus, a method and a system for implementing high-speed signals using a pseudo differential bus mechanism have been described.

We claim:
 1. A device comprising: a first device configured to drive at least three signals, a first reference signal, and a second reference signal; and a second device coupled to the first device and configured to receive the at least three signals, the first reference signal, and the second reference signal, the second device further configured to identify values for the at least three signals according to the first reference signal and the second reference signal.
 2. The device of claim 1, further comprising a storage device coupled to the second device and configured to store output signals from the second device.
 3. The device of claim 2, wherein the storage device includes a plurality of static latches.
 4. The device of claim 1, wherein the first device is a driver, which further includes a plurality of driver circuits.
 5. The device of claim 1, wherein the second device is a receiver, which further includes a plurality of receiver circuits.
 6. The device of claim 5, wherein each receiver circuit receives a signal, a first reference signal, and a second reference signal.
 7. The device of claim 5, wherein the receiver circuit is a P-sense amplifier receiver.
 8. The device of claim 1, wherein the first reference signal is a high reference signal and the second reference signal is a low reference signal.
 9. The device of claim 1, wherein the at least three signals are data signals.
 10. A method comprising: driving at least three signals, a first reference signal, and a second reference signal; receiving the at least three data signals, the first reference signal, and the second reference signal; and identifying values of the at least three signals according to the first reference signal and the second reference signal.
 11. The method of claim 10, further comprising amplifying the values of the at least three data signals.
 12. The method of claim 10, further comprising storing the values of the at least three data signals in a storage device.
 13. The method of claim 10, wherein the identifying values of the at least three signals further includes: sensing a voltage swing of one of the at least three signals when voltage changes approximately one hundred millivolts; and identifying a logic value corresponding to the voltage swing of one of the at least three signals in response to the first reference signal and the second reference signal.
 14. The method of claim 10, where the driving at least three signals, a first reference signal, and a second reference signal further includes: driving the first reference signal to high; and driving the second reference signal to low.
 15. A system comprising: a processor; a memory coupled to the processor; a driver coupled to the processor and configured to drive a plurality of grouped signals, each grouped signals further including at least three signals, a high reference signal, and a low reference signal; and a receiver coupled to the driver and configured to receive the plurality of grouped signals.
 16. The system of claim 15, further comprising a storage device coupled to the receiving device and configured to store outputs of the receiver.
 17. The system of claim 16, wherein the storage device includes a plurality of static latches.
 18. The system of claim 15, wherein the driver includes a plurality of driver circuits.
 19. The system of claim 13, wherein the receiver includes a plurality of receiver circuits.
 20. The system of claim 19, wherein each receiver circuit receives a signal, a high reference signal, and a low reference signal and identifies a logic value of the signal in response to the high reference signal and the low reference signal.
 21. A system comprising: a first device having a plurality of wires for transporting signals; a second device coupled to the first device and configured to perform a pre-charge function at a receiving end of the first device; and a third device coupled to the first device and configured to equalize potentials between the plurality of wires at a driving end of the first device.
 22. The system of claim 21, further comprising a driver coupled to the driving end of the first device and configured to drive the signals onto the plurality of wires.
 23. The system of claim 21, further comprising a receiver coupled to the receiving end of the first device and configured to receive the signals across the plurality of wires.
 24. The system of claim 23, wherein the receiver further includes at least one sensing amplifier.
 25. The system of claim 24, wherein the sensing amplifier receives a signal, a first reference signal, and a second reference signal.
 26. The system of claim 25, wherein the first reference signal is a high reference signal and the second reference signal is a low reference signal.
 27. The system of claim 21, wherein the second device is a pre-charge circuit and the third device is an equalizer circuit.
 28. A system comprising: a first device configured to drive at least three signals, a first reference signal, and a second reference signal onto a plurality of wires; a second device coupled to the first device and configured to receive the at least three signals, the first reference signal, and the second reference signal from the plurality of wires, the second device further configured to identify values for the at least three signals according to the first reference signal and the second reference signal; a third device coupled to the second device and configured to perform a pre-charge function on the plurality of wires closer to receiving end of the second device; and a fourth device coupled to the first device and configured to perform an equalizer function on the plurality of wires closer to driving end of the first device.
 29. A method comprising: pre-charging wire potentials at a receiving end of a plurality of wires, which transports a plurality of signals across the wires, during pre-charge phase; and equalizing wire potentials at a driving end of the plurality of wires during pre-charge phase.
 30. A circuit comprising: a first device configured to receive a signal from a driver; a second device coupled to the first device and configured to receive a first reference signal and a second reference signal from the driver; and a third device coupled to the first block and configured to output at least one output signal in response to the signal, the first, and second reference signals.